Treeful Docs

Author

Christoph Trost

Published

June 20, 2023

About this Book

This book is the documentation for Treeful, an R Shiny application helping people plant trees based on species distribution modelling.

Contribute

Who is behind this

  • Jakob Kutsch
  • Christoph Trost

About this Project and technical Approach

This project was funded from the German ministry for Education and Research from March 1st 2023 to September 1st 2023. The purpose is to build a shiny app that allows user to explore habitat shapes of trees in Europe. These habitat shapes are plots of existing trees from a merged database of close to 9 million trees. We extracted climatic variables present at each tree location. These shapes allow users to compare with their own location, a potential planting site, for the past and future.

Infrastructure

The docker-compose file

ETL Container

This container runs through the main script, fetching tree location data, getting bioclimatic variables, extracting those from the tree locations and writing everything to Postgis.

Postgis Database Container

All processed data from the ETL pipeline ends up in here. We chose to also write rasters to postgis, although rasters for projects like this are typically stored on disk. Our final shiny app will never read an entire raster but merely extract values from user locations.

Shiny Container

The actual app runs as docker swarm and connects to the Postgis database.

Treeful ETL Pipeline: Obtaining and transforming tree occurrence data

This section documents how tree occurrence data is obtained. At the end, we will have our tree locations with the corresponding bioclimatic variables in a postgis database. Additionally we write rasters and a few master data tables into postgis.

Get Climate Projection data & other external datasets

First, we obtain data not included in this repository, namely EU Forest and Copernicus CDS raster files. These downloads will take a while. (Some datasets are neither downloaded, nor included in this repo. They’ve been shared privately with us)

We have used Copernicus CMIP5 regional projections, with RPC4.5 and RPC8.5 experiments from the NorESM1-M (NCC, Norway) model. The datasets can be built and obtained here. Our code contains a short helper script for the automated download (set env vars for CDS login before)

Code
#################### Trees4F from figshare ###################
if (!file.exists("2_Data/0_raw_data/EUforestspecies_AMauri.csv")) {
  download.file("https://springernature.figshare.com/ndownloader/files/6662535", destfile = "2_Data/0_raw_data/EUforestspecies_AMauri.csv")
}

#################### COPERNICUS CDS Download ###################
# REGISTER YOURSELF AND ENTER DETAILS FROM HERE:
# https://cds.climate.copernicus.eu/user/register
# Beware the user is the UID at the bottom of your user page.

uid <- Sys.getenv("COPERNICUS_UID")

wf_set_key(user=uid, # 
           key=Sys.getenv("COPERNICUS_KEY"), 
           service="cds")

request <- list(
  region = "europe",
  variable = c("annual_mean_temperature", "annual_precipitation", "isothermality", "maximum_temperature_of_warmest_month", 
               "mean_diurnal_range", "mean_temperature_of_coldest_quarter", "mean_temperature_of_driest_quarter", "mean_temperature_of_warmest_quarter", 
               "mean_temperature_of_wettest_quarter", "minimum_temperature_of_coldest_month", "precipitation_of_coldest_quarter", "precipitation_of_driest_month", 
               "precipitation_of_driest_quarter", "precipitation_of_warmest_quarter", "precipitation_of_wettest_month",
               "precipitation_of_wettest_quarter", "precipitation_seasonality", "temperature_annual_range", "temperature_seasonality", "volumetric_soil_water", "aridity"),
  derived_variable = "annual_mean",
  model = "noresm1_m",
  ensemble_member = "r1i1p1",
  experiment = c("rcp4_5", "rcp8_5"),
  statistic = "mean",
  version = "1.0",
  format = "zip",
  dataset_short_name = "sis-biodiversity-cmip5-regional",
  target = "download.zip"
)

ncfile <- wf_request(
  user = uid,
  request = request,   
  transfer = TRUE,  
  path = "2_Data/0_raw_data/future",
  verbose = FALSE
)


request <- list(
  region = "europe",
  origin = "era5",
  variable = c("annual_mean_temperature", "annual_precipitation", "isothermality", "maximum_temperature_of_warmest_month", "mean_diurnal_range", "mean_temperature_of_coldest_quarter", "mean_temperature_of_driest_quarter", "mean_temperature_of_warmest_quarter", "mean_temperature_of_wettest_quarter", "minimum_temperature_of_coldest_month", "precipitation_of_coldest_quarter", "precipitation_of_driest_month", "precipitation_of_driest_quarter", "precipitation_of_warmest_quarter", "precipitation_of_wettest_month", 
               "precipitation_of_wettest_quarter", "precipitation_seasonality", "temperature_annual_range", "temperature_seasonality", "volumetric_soil_water"),
  derived_variable = "annual_mean",
  statistic = "mean",
  version = "1.0",
  format = "zip",
  dataset_short_name = "sis-biodiversity-era5-regional",
  target = "download.zip"
)

ncfile <- wf_request(
  user = uid,
  request = request,   
  transfer = TRUE,  
  path = "2_Data/0_raw_data/past",
  verbose = FALSE
)


if (file.exists("2_Data/0_raw_data/past/download.zip")) {
  utils::unzip("2_Data/0_raw_data/past/download.zip", exdir = "2_Data/0_raw_data/past/")
  file.remove("2_Data/0_raw_data/past/download.zip")
}
if (file.exists("2_Data/0_raw_data/future/download.zip")) {
  utils::unzip("2_Data/0_raw_data/future/download.zip", exdir = "2_Data/0_raw_data/future/")
  file.remove("2_Data/0_raw_data/future/download.zip")
}

Getting Tree Location Data

To produce reliable climate envelopes we would like to obtain as many trees from various biogeographical regions of Europe. This section details how we process them and produces a database with these occurrences:

Tree occurences in treeful per source database

Note

We are not doing any sampling on tree occurrences. This means that small areas with many tree records will produce a higher density in the climate envelopes. In the current state of our app and plots, this does reflect to the user.

Academic data sources

We used the TRY plant trait database and EU Forest. Both contain tree locations and botanical names. Both are academic, high-quality datasets with extensive data cleaning in place. We used EU Forest to generate our master list of botanical names. These trees will subsequently be used. A simple fuzzy matching is used to ensure Sorbus Torminalis and Sorbus-torminalis are matched as Sorbus torminalis (We used a conservative string dist=1).

Code
############ Load Packages ############ 
if(!require(librarian)) install.packages("librarian")
library(librarian)
shelf(data.table,stringr,stringdist,fuzzyjoin,dplyr)

############ Define Function for fuzzy matching ############ 
flatten_names <- function(tree_name) {
  tree_name <-  str_replace_all(tree_name," ", "") %>%
    str_replace_all(.,"\\.","") %>%
    str_replace_all(.,"\\-","") %>%
    str_replace_all(.,"\\'","") %>%
    tolower(.)
}


get_fuzzy_plant_names <- function(plant_database, master_list, max_string_dist=3){
# Prepare for matching
# Extract unique names
database_names <- plant_database[,.(latin_name=unique(latin_name))]

# Flatten names
master_list[,flat_name:=flatten_names(latin_name)]
database_names[,flat_name:=flatten_names(latin_name)]

# Match
stringdist <- as.data.table(
                stringdist_join(database_names, master_list, 
                by = c("flat_name" = "flat_name"), 
                mode = "inner", 
                method = "dl", 
                max_dist = max_string_dist, 
                distance_col = "dist"))
# Clean matching
stringdist <- stringdist[,.(data_set_name=latin_name.x,master_list_name=latin_name.y,distance=dist)]
stringdist <- setorder(stringdist,-distance)
return(stringdist)
# As an example of possible other matching functions:
# amatch <-  amatch(database_names[,flat_name], master_list[,flat_name], maxDist = 0.1)
}
############ MAIN: Create master list of botanical names ############ 
# Load datasets
# Use trees4EU as master list, more coverage
master_list <- data.table::fread("2_Data/0_raw_data/EUforestspecies_AMauri.csv") %>% 
  janitor::clean_names() %>% 
  dplyr::select(latin_name = species_name) %>% 
  distinct()
master_list <- na.omit(master_list)

############ OPEN TREES DATA BASE ############ 
# Load and prepare database

# Load database
open_trees_db <- fread("2_Data/1_output/all_merged.csv")
setnames(open_trees_db,"species","latin_name")
open_trees_db <- na.omit(open_trees_db)

# Match db names
max_string_dist = 1 
open_trees_matching_table<- get_fuzzy_plant_names(open_trees_db, master_list, max_string_dist)
open_trees_matching_table_selection <- open_trees_matching_table[distance<=1,]
# Filter database
open_trees_db_selection <- open_trees_db[open_trees_matching_table_selection[,.(data_set_name,master_list_name)], on = c(latin_name = "data_set_name"), nomatch = NULL]
# open_trees_db_selection2 <- open_trees_db[latin_name %in% open_trees_matching_table_selection[,data_set_name]]

############ TRY DATA BASE ############ 
# Load database
if (file.exists("2_Data/0_raw_data/tree_georef_3.txt")) {
  try_trees <- data.table::fread("2_Data/0_raw_data/tree_georef_3.txt") %>% 
    janitor::clean_names()
  
  setnames(try_trees,"acc_species_name","latin_name")
  try_trees <- na.omit(try_trees)
  
  
  # Match db names
  max_string_dist = 1 
  try_trees_matching_table<- get_fuzzy_plant_names(try_trees, master_list, max_string_dist)
  try_trees_matching_table_selection <- try_trees_matching_table[distance<=1,]
  # Filter database
  try_trees_selection <- try_trees[try_trees_matching_table_selection[,.(data_set_name,master_list_name)], on = c(latin_name = "data_set_name"), nomatch = NULL]
  # open_trees_db_selection2 <- open_trees_db[latin_name %in% open_trees_matching_table_selection[,data_set_name]]
  
  
} else {
  try_trees_selection <- tibble(master_list_name = character(), 
                                tree_georef_1_std_value = numeric(),
                                obs_data_std_value = numeric())
}


############ Trees4F DATA BASE ############ 
# Load and prepare database

# Load database
trees4f_db <- data.table::fread("2_Data/0_raw_data/EUforestspecies_AMauri.csv") %>% 
  janitor::clean_names()

setnames(trees4f_db,"species_name","latin_name")
trees4f_db <- na.omit(trees4f_db)

# Match db names
max_string_dist = 3 
trees4f_db_matching_table<- get_fuzzy_plant_names(trees4f_db, master_list, max_string_dist)
trees4f_db_matching_table_selection <- trees4f_db_matching_table[distance<=1,]
# Filter database
trees4f_db_selection <- trees4f_db[trees4f_db_matching_table_selection[,.(data_set_name,master_list_name)], on = c(latin_name = "data_set_name"), nomatch = NULL]
# open_trees_db_selection2 <- open_trees_db[latin_name %in% open_trees_matching_table_selection[,data_set_name]]




# enhance master list with GBIF taxo IDs and write master list to file
tree_master_list <- master_list %>% 
  mutate(gbif_taxo_id = name_backbone_checklist(name=.$latin_name)$usageKey) %>% 
  # remove  unmatched or genus level taxo matches
  filter(str_length(gbif_taxo_id) > 5 & !is.na(gbif_taxo_id))

fwrite(tree_master_list,"2_Data/1_output/eu_native_trees_master.csv")
tree_master_list <- fread("2_Data/1_output/eu_native_trees_master.csv") 


rm(open_trees_db, open_trees_matching_table, open_trees_matching_table_selection, try_trees, try_trees_matching_table, 
   try_trees_matching_table_selection, trees4f_db, trees4f_db_matching_table, trees4f_db_matching_table_selection
   )

#EOF

We finish this file with rgbif::name_backbone_checklist() where we obtain the GBIF taxo ID for each botanical name. This is part of our master data list.

European Tree Cadastres

We wanted to include the proliferating corpus of tree cadastres of European cities. We collected two dozen or so data sources into this file. Since they’re all unharmonized, extensive data cleaning happens here (we did not deal with CRS transformations but only kept datasets with EPSG:4326):

Code
#############
# This script will load various tree location data into one. 
# Goal: have all in one db with harmonized botanical species name, X and Y coordinate in EPSG 4326


tree_dbs <- read_xlsx("2_Data/0_raw_data/opendata_trees.xlsx") %>% 
  janitor::clean_names() %>% 
  filter(suitable == "y" & epsg == "4326") %>% 
  mutate(location = janitor::make_clean_names(location)) %>% 
  mutate(botanical_col = tolower(botanical_col)) 
  # mutate(lon_col = janitor::make_clean_names(lon_col)) %>% 
  # mutate(lat_col = janitor::make_clean_names(lat_col)) 


tree_dbs <- data.table(tree_dbs)
for (i in tree_dbs[,file_name]) {
  utils::unzip(zipfile=paste("2_Data/0_raw_data/tree_cadastres/zip/",i,".zip",sep=""))
}

for (i in 1:nrow(tree_dbs)) {

  ifelse(str_detect(tree_dbs$file_name[i], "csv"),
         assign(tolower(tree_dbs$location[i]),
                janitor::clean_names(read_delim(paste0("2_Data/0_raw_data/tree_cadastres/", tree_dbs$file_name[i])))),
         assign(tolower(tree_dbs$location[i]),
                janitor::clean_names(bind_cols(jsonlite::read_json(paste0("2_Data/0_raw_data/tree_cadastres/", tree_dbs$file_name[i]), simplifyVector = TRUE)$features$properties,
                                     jsonlite::read_json(paste0("2_Data/0_raw_data/tree_cadastres/", tree_dbs$file_name[i]), simplifyVector = TRUE)$features$geometry)
                                     )
                )
  )
  print(i)
}



# # Compress all the csv files with gzip
# tree_dbs <- data.table(tree_dbs)
# for (i in tree_dbs[,file_name]) {
#      db <- fread(paste("2_Data/0_raw_data/tree_cadastres/",i,sep=""))
#      fwrite(db,paste("2_Data/0_raw_data/tree_cadastres/",i,".gz",sep=""),compress="gzip")
# }

# tree_dbs[,file_name:=paste(file_name,".gz",sep="")]
# write.xlsx2(tree_dbs, "2_Data/0_raw_data/opendata_trees.xlsx", sheetName = "Sheet1", 
#             col.names = TRUE, row.names = TRUE, append = FALSE)

# 
# tree_dbs <- data.table(tree_dbs)
# for (i in tree_dbs[,location]) {
#     print(i)
#     assign(tolower(i), paste(fread(paste("2_Data/0_raw_data/tree_cadastres/",tree_dbs[location==i,file_name],sep=""))))
# }
              
# too small data to automate. manually merge mutate
#split_species <- filter(tree_dbs, str_detect(botanical_col, "\\+"))
#mutate(tourcoing, merged_species = paste0(genre, " ", espece)) %>% write_csv(file = "data/opentrees/tourcoing.csv")

merged_geo <- filter(tree_dbs, lon_col == lat_col)
sep_geo <- filter(tree_dbs, lon_col != lat_col)


#######
# unmerge merged geographies
for (i in 1:nrow(merged_geo)) {
  temp_db <- get(tolower(merged_geo$location[i]), envir = globalenv()) %>% 
    dplyr::select(matches(paste0(merged_geo$botanical_col[i], "|", merged_geo$lon_col[i]), ignore.case = TRUE)) 
  
  temp_db <- temp_db %>% 
    separate(merged_geo$lon_col[i], into = c("x", "y"), sep = ",") %>% 
    mutate(x = str_remove(x, "c\\("), y = str_remove(y, "\\)")) %>% 
    rename(species = merged_geo$botanical_col[i])
  
  assign(tolower(merged_geo$location[i]), temp_db)
  print(i)
  print(merged_geo$location[i])
}

# treat regular lat lon sources
for (i in 1:nrow(sep_geo)) {
  temp_db <- get(tolower(sep_geo$location[i]), envir = globalenv()) %>% 
    dplyr::select(matches(paste0(sep_geo$botanical_col[i], "|", sep_geo$lon_col[i], "|", sep_geo$lat_col[i]), ignore.case = TRUE))
  
  temp_db <- temp_db %>% 
    rename(species = sep_geo$botanical_col[i], 
           x = sep_geo$lat_col[i],
           y = sep_geo$lon_col[i]
           )
  
  assign(tolower(sep_geo$location[i]), temp_db)
  print(i)
  print(sep_geo$location[i])
}

#bring it all together into one df
all_trees <- tibble()

for (i in 1:nrow(tree_dbs)) {
  temp_db <- get(tolower(tree_dbs$location[i]), envir = globalenv()) %>% 
    dplyr::select(x, y, species)
  
  temp_db <- temp_db %>% 
    mutate(x = as.numeric(x)) %>% 
    mutate(y = as.numeric(y)) %>% 
    mutate(db_city_origin = tree_dbs$location[i])
  
  assign(tolower(tree_dbs$location[i]), temp_db)
  all_trees <- bind_rows(all_trees, temp_db)
  print(i)
  print(tree_dbs$location[i])
}

all_trees_centroids <- all_trees %>% 
  st_as_sf(coords = c("y", "x"), na.fail = F) %>% 
  group_by(db_city_origin) %>% 
  summarise(st_union(geometry)) %>%
  st_centroid() 

#### save it all on disk, 4 million trees
write_csv(all_trees, "2_Data/1_output/all_merged.csv")


#"specie|especi|nom_lat|latboomsoort|espece|
#nom_cientific|wetenschappelijke_naam|classe|dendro_taxon_tid|
#  nome_scien|traeart|SPECIE|sortiment|essence_scient|gattung_lat|species|gattung|geo_point"

# all tree lists saved as csv in data folder. why csv?
# cause majority of opendata portal tree lists were in csv so its least cumbersome conversion
# also csv means less characters and therefore less data when things become very large. 

This writes a file with only botanical name, X, Y and source database for the next step.

GBIF

The largest amount of our tree data comes from GBIF. We simply query a bounding box of Europe for all species from our master list, enriched already with GBIF taxo IDs. You could re-generate the dataset with most recently added tree occurrences with this (GBIF takes a while to prepare the dataset and you need to have your env variables GBIF_EMAIL, GBIF_PWD, GBIF_USER configured):

gbif_download <- occ_download(
    pred_in("taxonKey", tree_master_list$gbif_taxo_id),
    #pred("taxonKey", 5284884),
    # this is the bounding box of europe
    pred_within("POLYGON((-15 75,-15 30,40 30,40 75,-15 75))"),
    pred_lt("coordinateUncertaintyInMeters",1000), #downstream processing needs 1km accuracy
    pred("hasCoordinate", TRUE),
    pred("hasGeospatialIssue", FALSE), # remove GBIF default geospatial issues
    pred("occurrenceStatus","PRESENT"),
    pred_gte("year", 1960), #only keep trees seen after 1960
    format = "SIMPLE_CSV")
  #
   occ_download_wait(gbif_download)

Getting Bioclimatic Variables

We tested CHELSA, worldclim and Copernicus CDS to get bioclimatic variables. We prefer Copernicus for their extensive documentation and the convenient time frame of past bioclimatic data, from 1979-2018. This is where we expect that largest overlap with the life span of our trees from the occurrence dataset.

Our functions getpastclimate() and getfutureclimate() can theoretically be used to switch bioclimatic data sources but we have not implemented those. These functions are used to read our Copernicus raster files.

Reading Climate Rasters into R

In order to extract bioclimatic variables from each species location, we’ll have to load raster files into R. The function described here will solve a few complexities:

  • We convert raster values into units fit for anlysis.
  • Stacked rasters need to be treated accordingly.
  • Efficiency matters greatly here, especially when it comes to RAM-efficient raster reading.

What could and probably should not be done here: CRS-reprojections. Sometimes you may get a different CRS. Reproject outside of your ETL pipeline and then read.

Code
##################### Function to Get  climate Rasters ##################
# In theory they can easily be adjusted to other climate raster providers. For some, snippets exist in the functions already but probably wont work 
# in the entire workflow of treeful. All processing here is only tested with Copernicus
# you pass a provider and a bioclimatic var like bio01 or bio13 to the function and it returns one raster. you can stack them later on. 


getpastclimate <- function(source = "copernicus", bioclim = "bio01") {
  if (source == "climateeu") {
    bio_path <- case_when(bioclim == "bio01" ~ "MAT",
                          bioclim == "bio12" ~ "MAP",
                          )
    
    bio_raster <- raster(paste0("data/climateEU/Normal_1961-1990_bioclim/", bio_path, ".asc"))

    raster::crs(bio_raster) <- "+proj=aea +lat_0=30 +lon_0=10 +lat_1=43 +lat_2=62 +x_0=0 +y_0=0 +ellps=intl +units=m +no_defs +type=crs"
    
    bio_raster <- raster::projectRaster(bio_raster, crs = 4326)
    
  }
  
  if (source == "copernicus") {
    # Get bioclimate data from copernicus. Download bioclimate file with login at https://cds.climate.copernicus.eu/
    
    bio_path <- toupper(bioclim)

    bio_raster <- terra::rast(paste0("2_Data/0_raw_data/past/", bio_path, "_era5-to-1km_1979-2018-mean_v1.0.nc"))
    # convert bioclim as per copernicus documentation. for some reasone case_when does not work here.     
    if (bioclim %in% c("bio01", "bio05", "bio06", "bio08", "bio09", "bio10", "bio11")) 
    {bio_raster <- bio_raster - 273.15
    } else if (bioclim == "bio12") {bio_raster <- bio_raster*3600*24*365*1000
    } else if (bioclim %in% c("bio13", "bio14")) {bio_raster <- bio_raster*3600*24*30.5*1000
    } else if (bioclim %in% c("bio16", "bio17", "bio18", "bio19")) {bio_raster <- bio_raster*3600*24*91.3*1000
    }
    # a bit unclear if bio13-bio19 can and should also be comverted like bio12. probably not as theyre not on annual reference period

    
  }
  # worldclim would be great cause it fetches all 19 bioclimatic at once. and has a great time range from 1970-2000. 
  # but units of each var are stange. 

  if (source == "worldclim") {
    
    bio_path <- case_when(bioclim == "bio01" ~ "bio_1",
                          bioclim == "bio02" ~ "bio_2",
                          bioclim == "bio03" ~ "bio_3",
                          bioclim == "bio04" ~ "bio_4",
                          bioclim == "bio05" ~ "bio_5",
                          bioclim == "bio06" ~ "bio_6",
                          bioclim == "bio07" ~ "bio_7",
                          bioclim == "bio08" ~ "bio_8",
                          bioclim == "bio09" ~ "bio_9",
                          bioclim == "bio10" ~ "bio_10",
                          bioclim == "bio11" ~ "bio_11",
                          bioclim == "bio12" ~ "bio_12",
                          bioclim == "bio13" ~ "bio_13",
                          bioclim == "bio14" ~ "bio_14",
                          bioclim == "bio15" ~ "bio_15",
                          bioclim == "bio16" ~ "bio_16",
                          bioclim == "bio17" ~ "bio_17",
                          bioclim == "bio18" ~ "bio_18",
                          bioclim == "bio19" ~ "bio_19"
    )
    
    bio_raster <- raster(paste0("2_Data/1_output/worldclim_cropped/wc2.1_30s_", bio_path, ".tif"))
    

  }
  
  if (source == "chelsa") {
    
    bio_path <- case_when(bioclim == "bio01" ~ "bio1",
                          bioclim == "bio12" ~ "bio12"
    )
    
    bio_raster <- raster(paste0("2_Data/1_output/CHELSA_cropped/CHELSA_", bio_path, "_1981-2010_V.2.1.tif"))
    
    if (bioclim == "bio01") {bio_raster <- raster::calc(bio_raster, function(x) { x / 10 - 273.15 })
    } else if (bioclim == "bio12") {
      bio_raster <- raster::calc(bio_raster, function(x) { x / 10})
    }
    
    
  }
  
  
  return(bio_raster)
  rm(bio_raster)
  
}


#################### Get Future Climate ##################
# for now using climate projection model MPI-ESM1-2-LR and socio-econ pathway 245 

getfutureclimate <- function(source = "copernicus", bioclim = "bio01", experiment = "rcp45", future_date = `2050-01-01`) {
  if(source == "chelsa") {
    future_raster <- raster::stack(c("2_Data/1_output/CHELSA_cropped/CHELSA_bio1_2041-2070_gfdl-esm4_ssp370_V.2.1.tif", 
                    "2_Data/1_output/CHELSA_cropped/CHELSA_bio12_2041-2070_gfdl-esm4_ssp370_V.2.1.tif"))
    names(future_raster)[1] <- "bio01"
    names(future_raster)[2] <- "bio12"
    future_raster$bio01 <- raster::calc(future_raster$bio01, function(x) { x / 10 - 273.15 })
    future_raster$bio12 <- raster::calc(future_raster$bio12, function(x) { x / 10})
    
  } else if (source == "worldclim") {
    future_raster <- geodata::cmip6_tile(model = "MPI-ESM1-2-LR", lon = 11.01684, lat = 51.28691, 
                                         ssp = "245", time = "2041-2060", var = "bioc", path = "2_Data/0_raw_data/", res = 5)
    names(future_raster)[1] <- "bio01"
    names(future_raster)[12] <- "bio12"
  } else if (source == "copernicus") {

    bio_path <- toupper(bioclim)

    bio_raster <- terra::rast(paste0("2_Data/0_raw_data/future/", bio_path, "_noresm1-m_", experiment, "_r1i1p1_1960-2099-mean_v1.0.nc"))
    #$X2050.01.01
    names(bio_raster) <- terra::time(bio_raster)
    #bio_raster <- bio_raster$`2050-01-01`
    bio_raster <- terra::subset(bio_raster, future_date)
    # convert bioclim as per copernicus documentation. for some reason case_when does not work here.     
    if (bioclim %in% c("bio01", "bio05", "bio06", "bio08", "bio09", "bio10", "bio11")) 
    {bio_raster <- bio_raster - 273.15
    } else if (bioclim == "bio12") {bio_raster <- bio_raster*3600*24*365*1000
    } else if (bioclim %in% c("bio13", "bio14")) {bio_raster <- bio_raster*3600*24*30.5*1000
    } else if (bioclim %in% c("bio16", "bio17", "bio18", "bio19")) {bio_raster <- bio_raster*3600*24*91.3*1000
    }
    
  }
  
  return(bio_raster)
  rm(bio_raster)
}

getsoilproperties <- function(variable = "STU_EU_DEPTH_ROOTS") {
  soil_layer <- terra::rast(paste0("2_Data/0_raw_data/soil/", variable, "_4326.tif"))
  return(soil_layer)
  rm(soil_layer)
}

Merging Tree Locations and Bioclimatic Variables

On efficient functions and lots of RAM

Make Data available to Shiny App

Tables created in this ETL pipeline are written to a PostGIS DB. To make raster files available to shiny you have two (actually more like one) options:

Option 1: Writing Raster files to PostGIS DB

The bioclimatic and soil rasters used in this project are several GBs large and usually do not fit into memory. This snipped reads them as raster stack, writes the stack to the PostGIS DB and removes it.

Warning

This section relies on the package rpostgis. Currently, there’s no simple other way to write raster data to postGIS from R, neither sf nor terra nor stars. See this discussion. We therefore switched to the approach below: the Shiny app reads values directly from raster files on disk. Slightly less performant for our use case.

Code
print("large data transfer out starting. Writing all Rasters as rasterstack to Postgres. WE DO NOT RECOMMEND THIS APPROACH")

con <- backend_con()

# 
# pastbio01 <- getpastclimate(source = "copernicus", bioclim = "bio01")
# 
# 
# pastbio12 <- getpastclimate(source = "copernicus", bioclim = "bio12")
# crs(pastbio12) <- "+proj=longlat +datum=WGS84 +no_defs +type=crs"
# # we set this proj here and it seems to stick to R raster object. 
# # when writing into postgis, there SRID appears to be 3395

########################## Past Raster ############################
if (!RPostgres::dbExistsTable(conn = con, name = "past")) {
  sendstatus("Reading in Copernicus past raster")
  past <- raster::stack(raster::raster(getpastclimate(source = "copernicus", bioclim = "bio01")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio02")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio03")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio04")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio05")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio06")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio07")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio08")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio09")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio10")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio11")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio12")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio13")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio14")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio15")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio16")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio17")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio18")),
                        raster::raster(getpastclimate(source = "copernicus", bioclim = "bio19"))
  )
  sendstatus("Writing Copernicus Past to DB")
  rpostgis::pgWriteRast(con,
                        name = "past", raster = past, overwrite = TRUE
  )
  rm(past)
  gc()
  } else {
    cat("Raster layers exist in Postgres. Skipping. ")
  }
  
########################## Future 2050 4.5 Raster ############################
if (!RPostgres::dbExistsTable(conn = con, name = "future_2050_4_5")) {
  sendstatus("starting with reading Copernicus Future 2050 4.5")
  future <- raster::stack(raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio01", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio02", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio03", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio04", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio05", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio06", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio07", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio08", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio09", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio10", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio11", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio12", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio13", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio14", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio15", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio16", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio17", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio18", experiment = "rcp45", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio19", experiment = "rcp45", future_date = "2050-01-01"))
  )
  
  sendstatus("Writing Copernicus Future 2050 4.5")
  rpostgis::pgWriteRast(con,
                        name = "future_2050_4_5", raster = future, overwrite = TRUE
  )
  rm(future)
  gc()
  } else {
    cat("Raster layers exist in Postgres. Skipping. ")
}
  
  
########################## Future 2050 8.5 Raster ############################
if (!RPostgres::dbExistsTable(conn = con, name = "future_2050_8_5")) {
  sendstatus("starting with reading Copernicus Future 2050 8.5")
  future <- raster::stack(raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio01", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio02", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio03", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio04", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio05", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio06", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio07", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio08", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio09", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio10", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio11", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio12", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio13", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio14", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio15", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio16", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio17", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio18", experiment = "rcp85", future_date = "2050-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio19", experiment = "rcp85", future_date = "2050-01-01"))
  )
  
  sendstatus("Writing Copernicus Future 2050 8.5")
  rpostgis::pgWriteRast(con,
                        name = "future_2050_8_5", raster = future, overwrite = TRUE
  )
  rm(future)
  gc()
} else {
  cat("Raster layers exist in Postgres. Skipping. ")
}

########################## Future 2070 4.5 Raster ############################
if (!RPostgres::dbExistsTable(conn = con, name = "future_2070_4_5")) {
  sendstatus("starting with reading Copernicus Future 2070 4.5")
  future <- raster::stack(raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio01", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio02", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio03", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio04", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio05", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio06", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio07", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio08", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio09", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio10", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio11", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio12", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio13", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio14", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio15", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio16", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio17", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio18", experiment = "rcp45", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio19", experiment = "rcp45", future_date = "2070-01-01"))
  )
  
  sendstatus("Writing Copernicus Future 2070 4.5")
  rpostgis::pgWriteRast(con,
                        name = "future_2070_4_5", raster = future, overwrite = TRUE
  )
  rm(future)
  gc()
} else {
  cat("Raster layers exist in Postgres. Skipping. ")
}


########################## Future 2070 8.5 Raster ############################
if (!RPostgres::dbExistsTable(conn = con, name = "future_2070_8_5")) {
  sendstatus("starting with reading Copernicus Future 2070 8.5")
  future <- raster::stack(raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio01", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio02", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio03", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio04", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio05", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio06", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio07", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio08", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio09", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio10", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio11", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio12", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio13", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio14", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio15", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio16", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio17", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio18", experiment = "rcp85", future_date = "2070-01-01")),
                          raster::raster(getfutureclimate(source = "copernicus", bioclim = "bio19", experiment = "rcp85", future_date = "2070-01-01"))
  )
  
  sendstatus("Writing Copernicus Future 2070 8.5")
  rpostgis::pgWriteRast(con,
                        name = "future_2070_8_5", raster = future, overwrite = TRUE
  )
  rm(future)
  gc()
} else {
  cat("Raster layers exist in Postgres. Skipping. ")
}
  
  
  print("Reading in Soil Rasters")
  soil <- raster::stack(raster::raster(getsoilproperties("STU_EU_DEPTH_ROOTS")),
                        raster::raster(getsoilproperties("STU_EU_T_CLAY")),
                        raster::raster(getsoilproperties("STU_EU_S_CLAY")),
                        raster::raster(getsoilproperties("STU_EU_T_SAND")),
                        raster::raster(getsoilproperties("STU_EU_S_SAND")),
                        raster::raster(getsoilproperties("STU_EU_T_SILT")),
                        raster::raster(getsoilproperties("STU_EU_S_SILT")),
                        raster::raster(getsoilproperties("STU_EU_T_OC")),
                        raster::raster(getsoilproperties("STU_EU_S_OC")),
                        raster::raster(getsoilproperties("STU_EU_T_BD")),
                        raster::raster(getsoilproperties("STU_EU_S_BD")),
                        raster::raster(getsoilproperties("STU_EU_T_GRAVEL")),
                        raster::raster(getsoilproperties("STU_EU_S_GRAVEL")),
                        raster::raster(getsoilproperties("SMU_EU_T_TAWC")),
                        raster::raster(getsoilproperties("SMU_EU_S_TAWC")),
                        raster::raster(getsoilproperties("STU_EU_T_TAWC")),
                        raster::raster(getsoilproperties("STU_EU_S_TAWC"))
  )
  
  
  print("Writing Soil to DB")
  rpostgis::pgWriteRast(con,
                        name = "soil", raster = soil, overwrite = TRUE
  )
  rm(soil)
  gc()
  




DBI::dbDisconnect(conn = con)

Option 2: Mount raster files into Shiny container

The raster files downloaded from Copernicus CDS can simply be mounted into the Docker container.

Enrich Species Master Table

This section explains how to make your species distribution data more user friendly. For a specialist audience, you may be able to expose simply botanical names. Here we’re adding descriptions and images for species with a simple wikipedia scraper.

Code
library(librarian)
shelf(rvest)
# load list with botanical names and taxonomy IDs
tree_master_list <- fread("2_Data/1_output/eu_native_trees_master.csv")

# a simple function to search wikipedia for botanical names
wikipedia <- function(search_terms, lang = c("en", "de", "es", "fr")) {
    # if system language contains "en" use English Wikipedia version
    if (grepl("en", lang)) {
      return(paste0("https://en.wikipedia.org/w/index.php?search=", URLencode(search_terms)))
    }
    
    # if system language contains "de" use German Wikipedia version
    else if (grepl("de", lang)) {
      return(paste0("https://de.wikipedia.org/w/index.php?search=", URLencode(search_terms)))
    }
    
    
    # if "lang" is not specified and default system language is not
    # English, German, Spanish or French, use the English version
    else {
      return(paste0("https://en.wikipedia.org/w/index.php?search=", URLencode(search_terms)))
    }
  }

# scrape wikipedia for first paragraph and image
for (i in 1:nrow(tree_master_list)) {
  tree_url <- wikipedia(tree_master_list$latin_name[i], "de")
  tree_page <-  tree_url %>% 
    rvest::read_html()
  
  tree_tribble <- tibble::tribble(~descr_de, ~url, ~image_url,
                  rvest::html_text(rvest::html_element(tree_page, xpath = "/html/body/div[3]/div[3]/div[5]/div[1]/p[1]"), trim = TRUE),
                  tree_url,
                  rvest::html_attr(html_element(tree_page, xpath = "/html/body/div[3]/div[3]/div[5]/div[1]/table[1]/tbody/tr[2]/td/span/a/img"), "src")
                  ) 
  tree_master_list$descr_de[i] <- tree_tribble$descr_de
  tree_master_list$url[i] <- tree_tribble$url
  tree_master_list$image_url[i] <- tree_tribble$image_url
  cat("fetched ", tree_tribble$descr_de, " \n for ", tree_master_list$latin_name[i])
  # be kind to wikipedia
  Sys.sleep(1.2)
}

tree_master_list <- tree_master_list %>% 
  mutate(image_url = stringr::str_remove(image_url, "^\\/\\/"))

sendstatus("writing master list with drescriptions to postgres")

fwrite(tree_master_list, "2_Data/1_output/eu_native_trees_master.csv")
con <- backend_con()
RPostgres::dbWriteTable(con, "tree_master_list", tree_master_list, overwrite = TRUE)

DBI::dbDisconnect(conn = con)

Calculate Percentile Ranges from Mean

We have used the sampled and well documented dataset EU Forest to calculate percentiles ranges. Bioclimatic variables of tree occurrence do not yield a normal distribution, hence we are calculating the distance from mean for each species and variable. The Eu Forest dataset is sampled and contains enough occurrences, while occurrences from crowd-sourced efforts (such as GBIF) are more prone to distorted data (showing where trees have been mapped rather than where trees occur). We have used a cutoff of n>200 occurrences from EU Forest As per this source. Below are histograms for a few sample biclimatic variables: Annual BIO01 (Mean Temperature), BIO5 (Max Temperature of Warmest Month), BIO6 (Min Temperature of Coldest Month), BIO8 (Mean Temperature of Wettest Quarter), BIO9 (Mean Temperature of Driest Quarter), BIO12 (Annual Precipitation), BIO13 (Precipitation of Wettest Month), BIO14 (Precipitation of Driest Month)

Histogram of BIO01 of tree occurences

Histogram of BIO05 of tree occurences

Histogram of BIO06 of tree occurences

Histogram of BIO08 of tree occurences

Histogram of BIO09 of tree occurences

Histogram of BIO12 of tree occurences

Histogram of BIO13 of tree occurences

Histogram of BIO14 of tree occurences

Shiny App Development

The shiny app itself is developed as package using the golem framework. Most of the server and ui code explains itself.

Database Deployment

We use the docker image postgis/postgis and only add raster extension to it with a short init script.

Code
from postgis/postgis
# need to enable raster extentions


COPY ./initialize-raster.sh /docker-entrypoint-initdb.d/initialize-raster.sh
# CREATE EXTENSION postgis_raster;
# SET postgis.gdal_enabled_drivers = 'ENABLE_ALL';
# SELECT name, default_version,installed_version
# FROM pg_available_extensions WHERE name LIKE 'postgis%' or name LIKE 'address%';

Shiny App Deployment

Thanks to our Postgis database, the Shiny app itself is rather lightweight, does not need large files to be transferred and only does a minimum of plotting. The docker-compose.yml file specifies that both the backend postgres database and the frontend shiny application will be launched. You could also run them on separate hosts.

  • Generate Dockerfile from app. Golem handles this. It is wise to double-check the Dockerfile before buidling. When adding new packages, ensure they’re at the end of the Dockerfile in order to leverage build caching.
  • On your host: clone the repository with git clone https://github.com/3ful/treeful.git and move to the directory with cd ./treeful
  • Run stack with swarm and compose
    • launch your docker swarm: docker swarm init
    • In order to serve the same image to all nodes you need a docker registry. Create one with: docker service create --name registry --publish published=5000,target=5000 registry:2
    • Push images to registry: docker compose push
    • Now, deploy your swarm: docker stack deploy --compose-file docker-compose.yml treeful
    • Now you can easily scale up your shiny app to 3 replicas with docker service scale treeful_frontend=3
    • You can watch visitors of your frontend app with docker service logs treeful_frontend -f